Fast 3D Block Parallelisation for the Matrix Multiplication Prefix Problem - Application in Quantum Control

Authors

  • K. Waldherr
  • T. Schulte-Herbrüggen
Abstract

To exploit the power of supercomputers such as the HLRB-II cluster, the development of parallel algorithms is becoming increasingly important. The matrix prefix problem belongs to a class of problems that lend themselves to parallelisation. We compare the tree-based parallel prefix scheme, which is adapted from a recursive approach, with a sequential multiplication scheme in which only the individual matrix multiplications are parallelised. We show that this fine-grain approach outperforms the parallel prefix scheme by a factor of 2 to 3 and also has lower memory requirements. Unlike the tree-based scheme, the fine-grain approach allows many choices for the number of parallel processors and shows better speedup as the matrix sizes increase. Using the fine-grain approach instead of the coarse-grain approach in a quantum control algorithm allows us both to treat systems of higher dimension and to choose a finer discretisation.

K. Waldherr, T. Huckle and T. Auckenthaler: Dept. of Computer Science, TU Munich, D-85748 Garching, Germany
U. Sander and T. Schulte-Herbrüggen: Dept. of Chemistry, TU Munich, D-85747 Garching, Germany

Introduction: The Prefix Problem

In general, the prefix problem is given as follows: Let $\circ$ be a binary operator and $A$ a set that is closed under $\circ$. Furthermore, let $\circ$ be associative, i.e. the identity $(a \circ b) \circ c = a \circ (b \circ c)$ holds for all $a, b, c \in A$. Then, for given elements $x_1, \dots, x_M \in A$, the prefix problem is the computation of all the products

$$y_i = x_1 \circ \cdots \circ x_i. \tag{1}$$

Analogously, the suffix problem amounts to computing all the products ...
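To make the definition of the prefix products y_i concrete, the following Python sketch computes them in two ways: a left-to-right sequential scheme, and one common recursive (tree-style) formulation of a prefix scheme. It is only an illustration under stated assumptions: the function names `matmul`, `prefix_sequential` and `prefix_tree` are hypothetical, the code runs serially, and the recursive variant is not necessarily the exact scheme used by the authors. In the fine-grain approach described in the abstract, the parallelism would sit inside each individual matrix product; in the tree-based approach, the independent products at each level would be distributed across processors.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain triple-loop matrix product (stand-in for a parallel multiply)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def prefix_sequential(xs: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Compute y_i = x_1 o ... o x_i left to right with M - 1 applications of op."""
    ys: List[T] = []
    for x in xs:
        ys.append(x if not ys else op(ys[-1], x))
    return ys


def prefix_tree(xs: Sequence[T], op: Callable[[T, T], T]) -> List[T]:
    """Recursive prefix scheme (illustrative variant).

    1. Combine neighbouring pairs (these products are mutually independent).
    2. Recurse on the half-length list of pair products.
    3. Fill in the remaining prefixes with one more independent product each.
    Operand order is preserved, so op only needs to be associative.
    """
    m = len(xs)
    if m <= 1:
        return list(xs)
    pairs = [op(xs[i], xs[i + 1]) for i in range(0, m - 1, 2)]
    sub = prefix_tree(pairs, op)  # sub[j] is the product of xs[0..2j+1]
    ys = [xs[0]]
    for i in range(1, m):
        if i % 2 == 1:
            ys.append(sub[i // 2])                 # already known from recursion
        else:
            ys.append(op(sub[i // 2 - 1], xs[i]))  # one extra product
    return ys


if __name__ == "__main__":
    # Non-commuting 2x2 integer matrices, so the operand order matters.
    mats = [[[1, 1], [0, 1]], [[1, 0], [1, 1]], [[2, 0], [0, 1]],
            [[0, 1], [1, 0]], [[1, 2], [3, 4]]]
    print(prefix_sequential(mats, matmul) == prefix_tree(mats, matmul))
```

Both functions only rely on associativity of the operator, which is exactly the assumption made in the problem statement; string concatenation, for example, works just as well as matrix multiplication.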

Similar resources

Matrix exponentials and parallel prefix computation in a quantum control problem

Quantum control plays a key role in quantum technology, in particular for steering quantum systems. As problem size grows exponentially with the system size, it is necessary to deal with fast numerical algorithms and implementations. We improved an existing code for quantum control concerning two linear algebra tasks: The computation of the matrix exponential and efficient parallelisation of pr...

Parallelisation of Block-Recursive Matrix Multiplication in Prefix Computations

© 2007 by John von Neumann Institute for Computing. Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise requires prior specific permission by the publisher ment...

Modified 32-Bit Shift-Add Multiplier Design for Low Power Application

Multiplication is a basic operation in any signal processing application. Multiplication is the most important one among the four arithmetic operations like addition, subtraction, and division. Multipliers are usually hardware intensive, and the main parameters of concern are high speed, low cost, and less VLSI area. The propagation time and power consumption in the multiplier are always high. ...

Task-Based Algorithm for Matrix Multiplication: A Step Towards Block-Sparse Tensor Computing

Distributed-memory matrix multiplication (MM) is a key element of algorithms in many domains (machine learning, quantum physics). Conventional algorithms for dense MM rely on regular/uniform data decomposition to ensure load balance. These traits conflict with the irregular structure (block-sparse or rank-sparse within blocks) that is increasingly relevant for fast methods in quantum physics. T...

Parallelising Matrix Operations on Clusters for an Optimal Control-Based Quantum Compiler

Quantum control plays a key role in quantum technology, e.g. for steering quantum hardware systems, spectrometers or superconducting solid-state devices. In terms of computation, quantum systems provide a unique potential for coherent parallelisation that may exponentially speed up algorithms as in Shor’s prime factorisation. Translating quantum software into a sequence of classical controls st...


Publication date: 2009